Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, machine learning, and much more.
We will be using a Jupyter server as the primary web interface for this workshop. Several notebooks have been provided to you, in advance, to guide you through the workshop. After the workshop, you may use the Agave Jupyter image to recreate the notebook server and repeat the workshop, or continue on with your own work at your leisure.
The Agave image has several customizations to facilitate use of the platform and ease much of the heavy lifting done behind the scenes in this tutorial.
Your Jupyter server has multiple kernels available for use right away. We have preconfigured them with several useful libraries and tools to help you get up and running with common tasks more easily. Additionally, we have bundled the Agave CLI into the Bash kernel and the Python SDK into the Python 2 and Python 3 kernels. These kernels are pre-authenticated with valid Agave auth tokens that you can use to begin interacting with the Agave Platform right away.
Your home directory on the Jupyter server is shared with your sandbox, so you can safely copy data between the two environments quickly and easily.
Jupyter contains a web terminal that can be used to access your sandbox environment or interact with the Jupyter container itself. To log in to your sandbox from the Jupyter web terminal, run the following command:
ssh -p 10022 $VM_IPADDRESS
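If the command above fails with a usage error, one possible cause is an empty VM_IPADDRESS variable. The following is a minimal sketch of a helper that checks the variable before building the command. The function name sandbox_ssh_cmd is hypothetical and not part of the tutorial image, and it only prints the command rather than running it.

```shell
# Hypothetical helper: print the ssh command for the sandbox,
# failing early if VM_IPADDRESS is unset or empty.
sandbox_ssh_cmd() {
  local host="${VM_IPADDRESS:-}"
  local port="${1:-10022}"   # 10022 is the sandbox ssh port used in this tutorial
  if [ -z "$host" ]; then
    echo "VM_IPADDRESS is not set" >&2
    return 1
  fi
  printf 'ssh -p %s %s\n' "$port" "$host"
}
```

Calling sandbox_ssh_cmd in the web terminal prints the command to run, or an error message if the variable is missing.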
This tutorial is presented as a series of Jupyter notebooks. If you are attending this tutorial in person, you will download the notebooks into the home directory of your notebook server. If you are following along after the fact, you should download the notebooks from the GitHub repository into your Jupyter workspace:
git clone --depth 1 https://github.com/agaveplatform/SC17-container-tutorial.git
The tutorial walks you through the process of obtaining a set of API keys and authenticating to the Agave Platform. Once this is done, you no longer need to authenticate to follow the tutorial. Both the Agave CLI and Python SDK will pick up your authorization cache and automatically refresh it as needed.
Inside the examples directory, you will find several notebooks to help you learn more about the Agave platform, containers, and SciOps. We leave these for you to follow after the tutorial.
The tutorial sandbox is a full Ubuntu 16.04 server running as a Docker container on a VM dedicated for your use in this tutorial. The sandbox has a standard HPC build environment with OpenMPI, Python 2, Python 3, build-essential, gfortran, openssl, git, jq, vim, and a host of other utilities.
Docker and Singularity are both pre-installed in your sandbox. All images used in this tutorial are available from the public Agave Docker Hub and Singularity Hub accounts. You may also use your own private registry accounts, but you will need to log in to the respective registries on your own.
The sample code for this project is already present in $HOME/FUNWAVE-TVD.
Your $HOME/work directory on the Jupyter server is shared with your sandbox, so you can safely copy data between the two environments quickly and easily.
To log in to the sandbox from outside the Jupyter server, use the host IP address. You will find the public IP address of your sandbox in the $VM_IPADDRESS environment variable. Valid ssh keys are available in the ~/.ssh directory of your Jupyter server. Alternatively, you can append your own public key to the $HOME/.ssh/authorized_keys file.
ssh -i /path/to/private/key.pem -p 10022 jovyan@$VM_IPADDRESS
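Appending your own public key, as described above, can be done by hand, but it is easy to add the same key twice or leave the file with permissions that sshd rejects. The following is a minimal sketch, assuming it is run inside the sandbox and that your public key has already been copied there. The function name add_authorized_key is hypothetical.

```shell
# Hypothetical helper: append a single public key to authorized_keys
# only if that exact line is not already present.
add_authorized_key() {
  local pubkey_file="$1"
  local auth_file="${2:-$HOME/.ssh/authorized_keys}"
  local key
  [ -r "$pubkey_file" ] || { echo "cannot read $pubkey_file" >&2; return 1; }
  key="$(cat "$pubkey_file")"
  # sshd ignores key files with overly permissive modes, so tighten them.
  mkdir -p "$(dirname "$auth_file")"
  chmod 700 "$(dirname "$auth_file")"
  touch "$auth_file"
  chmod 600 "$auth_file"
  # -x: match the whole line, -F: treat the key as a fixed string.
  if ! grep -qxF -- "$key" "$auth_file"; then
    printf '%s\n' "$key" >> "$auth_file"
  fi
}
```

For example, add_authorized_key "$HOME/work/id_ed25519.pub" would register a key that you placed in the shared work directory; the file name here is only a placeholder.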
Your VM will remain available for 1-2 days following the tutorial. During that time, your data will remain available. After that, the VM and any data saved with it will be destroyed. If you need to persist your data, we recommend that you move it to another host, or create your own account in the Agave public tenant and save your data in the free cloud storage provided to you by default there.
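One way to move your data to another host before the VM is destroyed is to bundle it into a single archive first. The following is a minimal sketch, assuming the data you care about lives under $HOME/work in the sandbox. The function name archive_work_dir and the archive naming scheme are hypothetical.

```shell
# Hypothetical helper: create a timestamped, compressed archive of a directory
# and print the path of the archive that was written.
archive_work_dir() {
  local src="${1:-$HOME/work}"
  local dest_dir="${2:-$HOME}"
  local archive
  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
  archive="$dest_dir/work-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
  # -C changes into the parent directory so the archive holds relative paths.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
  printf '%s\n' "$archive"
}
```

After running archive_work_dir in the sandbox, you could fetch the result from your own machine with scp, which takes the port as an uppercase -P option, for example scp -P 10022 jovyan@$VM_IPADDRESS:work-backup-*.tar.gz . with $VM_IPADDRESS replaced by the address of your sandbox.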
We have already configured resources for you to use in this tutorial.
Each of you has a dedicated VM provided by the Nectar Cloud. You will use this VM for the duration of the tutorial.
A training account on the Agave Platform's public tenant has also been allocated to you.
Your Jupyter server is available at <username>.sc17.training.agaveplatform.org.
Usernames will be training001 to training100. We will count off to determine our instance.
When you first log in, you will find your workspace empty, save for a notebook named INSTALL.ipynb. Open this notebook by clicking on the notebook name, then click the "run" button. This will fetch all the tutorial notebooks from the tutorial's git repository and add them to your workspace.
Once complete, open the Config notebook to begin the meat of our tutorial.
If you are following along with this tutorial at home, you can recreate the tutorial Jupyter server and sandbox environments by running the containers on your own server using the following Docker Compose file (i.e., save the contents below in a file named docker-compose.yml).
version: '2'
volumes:
  training-volume:
services:
  jupyter:
    image: agaveplatform/jupyter-notebook:latest
    command: start-notebook.sh --NotebookApp.token=''
    mem_limit: 2048m
    ports:
      - '8888:8005'
    environment:
      - VM_MACHINE=training-node-${AGAVE_USERNAME}
      - VM_HOSTNAME=localhost:8888
      - USE_TUNNEL=True
      - ENVIRONMENT=training
      - SCRATCH_DIR=/home/jovyan
      - MACHINE_USERNAME=jovyan
      - MACHINE_NAME=sandbox
      - DOCKERHUB_NAME=stevenrbrandt
      - AGAVE_APP_DEPLOYMENT_PATH=agave-deployment
      - AGAVE_CACHE_DIR=/home/jovyan/work/.agave
      - AGAVE_JSON_PARSER=jq
      - AGAVE_USERNAME=${AGAVE_USERNAME}
      - AGAVE_PASSWORD=${AGAVE_PASSWORD}
      - AGAVE_SYSTEM_SITE_DOMAIN=localhost
      - AGAVE_STORAGE_WORK_DIR=/home/jovyan
      - AGAVE_STORAGE_HOME_DIR=/home/jovyan
      - AGAVE_APP_NAME=funwave-tvd-sc17-${AGAVE_USERNAME}
      - AGAVE_STORAGE_SYSTEM_ID=nectar-storage-${AGAVE_USERNAME}
      - AGAVE_EXECUTION_SYSTEM_ID=nectar-exec${AGAVE_USERNAME}
    volumes:
      - training-volume:/home/jovyan/work
      - ../notebooks:/home/jovyan/notebooks
  sandbox:
    image: agaveplatform/sc17-sandbox:latest
    mem_limit: 2048m
    privileged: True
    ports:
      - '10022:22'
    environment:
      - VM_MACHINE=training-node-${AGAVE_USERNAME}
      - NGROK_TOKEN=${NGROK_TOKEN}
      - USE_TUNNEL=True
      - ENVIRONMENT=training
      - AGAVE_CACHE_DIR=/home/jovyan/work/.agave
    volumes:
      - training-volume:/home/jovyan/work
      - /var/run/docker.sock:/var/run/docker.sock
      - $HOME/.docker:/home/jovyan/.docker:ro
To run the above, you first need to set the environment variables AGAVE_USERNAME, AGAVE_PASSWORD, and NGROK_TOKEN. The first two should be your Agave username and password as obtained from Agave TOGO. The ngrok token should be obtained from ngrok. Ngrok will provide tunnelling for you so that Agave can ssh into your laptop or desktop machine. It will do this by setting the VM_IPADDRESS, VM_HOSTNAME, and VM_SSH_PORT for you.
Once you have these things set up, you should be able to run docker-compose up (note: you should run this command from the same directory in which you created your docker-compose.yml file). You should then be able to use your browser to connect to the tutorial setup on port 8888 of your local machine (http://localhost:8888).
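Docker Compose substitutes an empty string for any variable that is not set, which can leave the containers half-configured. The following is a minimal sketch of a pre-flight check for the three variables named above. The function name check_compose_env is hypothetical, and the values in the usage comments are placeholders.

```shell
# Hypothetical helper: fail early if any variable that docker-compose.yml
# interpolates is missing from the exported environment.
check_compose_env() {
  local missing=0
  local name
  for name in AGAVE_USERNAME AGAVE_PASSWORD NGROK_TOKEN; do
    # printenv only sees exported variables, which is what docker-compose sees.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (placeholder values; substitute your own credentials):
#   export AGAVE_USERNAME=your-agave-username
#   export AGAVE_PASSWORD=your-agave-password
#   export NGROK_TOKEN=your-ngrok-token
#   check_compose_env && docker-compose up
```

Exporting the variables matters here: a plain shell assignment without export is not visible to docker-compose.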